Guess where? Actor-supervision for spatiotemporal action localization

Similar resources

Guess Where? Actor-Supervision for Spatiotemporal Action Localization

This paper addresses the problem of spatiotemporal localization of actions in videos. Compared to leading approaches, which all learn to localize based on carefully annotated boxes on training video frames, we adhere to a weakly-supervised solution that only requires a video class label. We introduce an actor-supervised architecture that exploits the inherent compositionality of actions in term...

Actor-independent action search using spatiotemporal vocabulary with appearance hashing

Human actions in movies and sitcoms usually capture semantic cues for story understanding, which offer a novel search pattern beyond the traditional video search scenario. However, there are great challenges to achieve action-level video search, such as global motions, concurrent actions, and actor appearance variances. In this paper, we introduce a generalized action retrieval framework, which...

The multi-item localization (MILO) task: measuring the spatiotemporal context of vision for action.

We describe a new multi-item localization task that can be used to probe the temporal and spatial contexts of search-like behaviors. A sequence of four target letters (e.g., E, F, G, and H) was presented among four distractor letters. Observers located the targets in order. Both retrospective and prospective components of performance were examined. The retrospective component was assessed by ha...

The multi-item localization (MILO) task: Measuring the spatiotemporal context of vision for action

This article introduces a new task for exploring the sequential selection of multiple target items during searchlike behavior. This multi-item localization (MILO) task differs in a number of respects from traditional visual search paradigms and, in particular, places a strong emphasis on the temporal, as well as the spatial, aspects of behavior. We will begin by describing the novel features of...

Spatiotemporal Residual Networks for Video Action Recognition

Two-stream Convolutional Networks (ConvNets) have shown strong performance for human action recognition in videos. Recently, Residual Networks (ResNets) have arisen as a new technique to train extremely deep architectures. In this paper, we introduce spatiotemporal ResNets as a combination of these two approaches. Our novel architecture generalizes ResNets for the spatiotemporal domain by intro...

Journal

Journal title: Computer Vision and Image Understanding

Year: 2020

ISSN: 1077-3142

DOI: 10.1016/j.cviu.2019.102886